Hierarchical Deep Reinforcement Learning




Reviews: Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Neural Information Processing Systems

I believe the proposed method, HAL (Hierarchical Abstraction with Language), is an interesting approach to HRL. The authors adapt Hindsight Experience Replay for instructions (called Hindsight Instruction Relabelling). I have some concerns about the experimental setup and empirical evaluation of the proposed method: the motivation behind introducing a new environment is unclear. There are many similar existing environments, such as the crafting environment used by [1] and the compositional and relational navigation environments in [2]. Introducing a new environment (unless it is necessary) hinders proper comparison and benchmarking.
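The Hindsight Instruction Relabelling idea the reviewer mentions can be illustrated with a minimal sketch: a trajectory that failed its original instruction is relabelled with instructions that its final state actually satisfies, turning failures into positive training data. The function names here (`hindsight_instruction_relabel`, `describe_achieved`) are hypothetical stand-ins, not the paper's actual API.

```python
def hindsight_instruction_relabel(trajectory, describe_achieved, replay_buffer):
    """Relabel a trajectory with instructions its final state satisfies
    (a sketch of Hindsight Instruction Relabelling, assuming
    `describe_achieved` maps a state to the instructions it fulfils and
    `trajectory` is a list of (state, action, next_state) tuples)."""
    final_state = trajectory[-1][2]
    for instruction in describe_achieved(final_state):
        for state, action, next_state in trajectory:
            # Reward 1 only on the step that completes the relabelled instruction.
            reward = 1.0 if next_state == final_state else 0.0
            replay_buffer.append((state, action, reward, next_state, instruction))
    return replay_buffer
```

In this sketch every achieved instruction yields one relabelled copy of the trajectory, so a single rollout can produce many supervised goal-reaching examples.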


Reviews: Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Neural Information Processing Systems

The additional experiments presented in the rebuttal addressed many of the reviewers' concerns; however, there was some doubt that these changes would be successfully incorporated into a camera-ready. These additions (especially the use of a full language model in the policy and the crafting-world results) would significantly strengthen the paper, and I strongly urge the authors to follow through on their rebuttal commitment to integrate these results in future revisions. There is also a concern that the approach is highly specialized to the environment and is limited by its need for automatic goal-language prediction / verification to perform HIL. Given the content of the paper already, this might be better left to future work.


Reviews: Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Neural Information Processing Systems

The ideas in this paper are interesting and worth pursuing. It is a very clear and sensible example of combining hierarchy with deep RL, a combination of high current interest. The initial experiment on the "six state" MDP is so trivial it is uninteresting. The Montezuma's Revenge example is much nicer, demonstrating impact (albeit with a little bit of handcrafting) on a problem known to be challenging for the current state of the art, and would be worth seeing at NIPS. The paper is technically a little sloppy in places.


Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Jiang, YiDing, Gu, Shixiang (Shane), Murphy, Kevin P., Finn, Chelsea

Neural Information Processing Systems

Solving complex, temporally-extended tasks is a long-standing problem in reinforcement learning (RL). We hypothesize that one critical element of solving such problems is the notion of compositionality. With the ability to learn sub-skills that can be composed to solve longer tasks, i.e. hierarchical RL, we can acquire temporally-extended behaviors. However, acquiring effective yet general abstractions for hierarchical RL is remarkably challenging. In this paper, we propose to use language as the abstraction, as it provides unique compositional structure, enabling fast learning and combinatorial generalization, while retaining tremendous flexibility, making it suitable for a variety of problems.
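The abstract's idea of language as the interface between levels of a hierarchy can be sketched as a two-level control loop: a high-level policy emits an instruction (a string), and a low-level policy conditioned on that instruction acts in the environment for a bounded number of steps. The `env`, `high_policy`, and `low_policy` interfaces below are hypothetical, not the paper's implementation.

```python
def run_hierarchy(env, high_policy, low_policy, horizon=5, max_instructions=10):
    """Two-level control loop (a sketch, assuming `env.step` returns
    (state, done)): the high-level policy picks an instruction string,
    and the low-level policy, conditioned on that instruction, acts in
    the environment for up to `horizon` steps before a new instruction
    is chosen."""
    state = env.reset()
    for _ in range(max_instructions):
        # The instruction space is compositional natural language,
        # e.g. "put the red block left of the blue block".
        instruction = high_policy(state)
        for _ in range(horizon):
            action = low_policy(state, instruction)
            state, done = env.step(action)
            if done:
                return state
    return state
```

Because the instruction space is language, the same low-level policy can in principle be reused for novel instruction compositions it was never explicitly trained on.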



Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Kulkarni, Tejas D., Narasimhan, Karthik, Saeedi, Ardavan, Tenenbaum, Josh

Neural Information Processing Systems

Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. One of the key difficulties is insufficient exploration, resulting in an agent being unable to learn robust policies. Intrinsically motivated agents can explore new behavior for their own sake rather than to directly solve external goals. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical action-value functions, operating at different temporal scales, with goal-driven intrinsically motivated deep reinforcement learning. A top-level q-value function learns a policy over intrinsic goals, while a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse and delayed feedback: (1) a complex discrete decision process with stochastic transitions, and (2) the classic ATARI game `Montezuma's Revenge'.
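The h-DQN interaction described in the abstract can be sketched as a nested loop: the meta-controller picks an intrinsic goal, and the controller picks atomic actions until that goal is reached (intrinsic reward 1) or the episode ends. The callables below (`meta_controller`, `controller`, `goal_reached`) are hypothetical stand-ins for the learned Q-functions and the goal critic, not the paper's code.

```python
def h_dqn_episode(env, meta_controller, controller, goal_reached, max_steps=100):
    """One episode of the h-DQN control loop (a sketch, assuming
    `env.step` returns (state, reward, done)). The meta-controller
    selects an intrinsic goal; the controller selects atomic actions
    and receives an intrinsic reward of 1 when the goal is satisfied."""
    state = env.reset()
    extrinsic_return, steps = 0.0, 0
    while steps < max_steps:
        goal = meta_controller(state)         # argmax over goals of the top-level Q
        while steps < max_steps:
            action = controller(state, goal)  # argmax over actions of the goal-conditioned Q
            state, reward, done = env.step(action)
            extrinsic_return += reward
            steps += 1
            intrinsic_reward = 1.0 if goal_reached(state, goal) else 0.0
            if intrinsic_reward or done:
                break                         # goal achieved (or episode over): pick a new goal
        if done:
            break
    return extrinsic_return
```

The two timescales are visible in the structure: the outer loop runs once per goal, the inner loop once per atomic action, which is what lets the meta-controller explore in a much smaller, more meaningful space than raw actions.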